Answering Top-k Queries Over a Mixture of Attractive and Repulsive Dimensions

Authors

  • Sayan Ranu
  • Ambuj K. Singh
Abstract

In this paper, we formulate a top-k query that compares objects in a database to a user-provided query object on a novel scoring function. The proposed scoring function combines the idea of attractive and repulsive dimensions into a general framework to overcome the weakness of traditional distance or similarity measures. We study the properties of the proposed class of scoring functions and develop efficient and scalable index structures that index the isolines of the function. We demonstrate various scenarios where the query finds application. Empirical evaluation demonstrates a performance gain of one to two orders of magnitude on querying time over existing state-of-the-art top-k techniques. Further, a qualitative analysis is performed on a real dataset to highlight the potential of the proposed query in discovering hidden data characteristics.
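
The abstract does not state the scoring function itself, so the following is only a minimal sketch of the general idea, under loudly stated assumptions: objects are numeric tuples, per-dimension distance is the absolute difference, and the score is the total distance on repulsive dimensions minus the total distance on attractive dimensions. The names `score` and `top_k` are hypothetical, and the linear scan shown here stands in for a baseline; it is not the isoline-based index the paper develops.

```python
import heapq

def score(obj, query, attractive, repulsive):
    """Hypothetical score (an assumption, not the paper's definition):
    reward distance from the query on repulsive dimensions and
    penalize distance from the query on attractive dimensions."""
    closeness_penalty = sum(abs(obj[d] - query[d]) for d in attractive)
    separation_reward = sum(abs(obj[d] - query[d]) for d in repulsive)
    return separation_reward - closeness_penalty

def top_k(objects, query, attractive, repulsive, k):
    """Brute-force top-k by linear scan over all objects."""
    return heapq.nlargest(
        k, objects, key=lambda o: score(o, query, attractive, repulsive)
    )

# Toy data: dimension 0 is attractive, dimension 1 is repulsive.
database = [(1.0, 9.0), (1.5, 2.0), (6.0, 9.5), (1.2, 7.0)]
query = (1.0, 2.0)
result = top_k(database, query, attractive=[0], repulsive=[1], k=2)
print(result)
```

In this toy run the highest-scoring objects are those that stay near the query on dimension 0 while lying far from it on dimension 1. An index over the function's isolines, as the abstract describes, would aim to return the same answers without scanning every object.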

Similar resources

A Dynamic Method for Answering Ad Hoc Continuous Aggregate Queries

Data streams are infinite, fast sequences of time-stamped data elements that arrive in bursts. These elements generally need to be processed online and in real time, so algorithms that process data streams and answer queries over them are mostly one-pass. Executing such algorithms raises challenges such as memory limitations, scheduling, and accuracy of answers. They will be more ...

Linear Sketches for Approximate Aggregate Range Queries

Answering aggregate queries approximately over multidimensional data is an important problem that arises naturally in many applications. An approach to the problem is to maintain a succinct (i.e. O(k) space) representation, called sketch, of the frequency distribution h of the data, and use ĥ for answering queries. Common sketches are constructed via linear mappings of h onto a k–dimensional sp...

A Generic Framework for Top-k Pairs and Top-k Objects Queries over Sliding Windows

Top-k pairs and top-k objects queries have received significant attention by the research community. In this paper, we present the first approach to answer a broad class of top-k pairs and top-k objects queries over sliding windows. Our framework handles multiple top-k queries and each query is allowed to use a different scoring function, a different value of k and a different size of the slidi...

Answering Top K Queries Efficiently with Overlap in Sources and Source Paths

Challenges in answering queries over Web-accessible sources are selecting the sources that must be accessed and computing answers efficiently. Both tasks become more difficult when there is overlap among sources and when sources may return answers of varying quality. The objective is to obtain the best answers while minimizing the costs or delay in computing these answers and is similar to solv...

Encoding Two-Dimensional Range Top-k Queries

We consider various encodings that support range Top-k queries on a two-dimensional array containing elements from a total order. For an m × n array, with m ≤ n, we first propose an almost optimal encoding for answering one-sided Top-k queries, whose query range is restricted to [1 . . . m][1 . . . a], for 1 ≤ a ≤ n. Next, we propose an encoding for the general Top-k queries that takes m^2 lg ((k...

Journal:
  • PVLDB

Volume 5  Issue

Pages  -

Publication date 2011